Transmitting Video



The present invention is concerned with methods and apparatus for transmitting encoded video material
over a network.

According to one aspect of the present invention there is provided a method of transmitting encoded
video over a network to a terminal, comprising: storing a plurality of encoded versions of the same
video sequence, wherein each version comprises a plurality of discrete portions of data and each
version corresponds to a respective different degree of compression; ascertaining the data rate
permitted by the network; ascertaining the state of a receiving buffer at the terminal; for each version,
computing for discrete portions thereof as yet unsent the value of a timing error that would occur were
any number of portions starting with that portion to be sent at the currently ascertained permitted rate;
for each version, determining for each of at least some of the discrete portions thereof as yet unsent the
maximum of the error values for that portion; for each version, comparing the determined maximum
error value with the ascertained buffer state; selecting one of said versions for transmission, in
dependence on the results of said comparisons; and
transmitting the selected version.

In another aspect, the invention provides a method of transmitting encoded video over a network to a 
terminal, comprising: storing a plurality of encoded versions of the same video sequence, wherein each 
version comprises a plurality of discrete portions of data and each version corresponds to a respective 
different degree of compression; for each version and for each of a plurality of nominal transmitting 
rates, computing for discrete portions thereof the value of a timing error that would occur were any 
number of portions starting with that portion to be sent at the respective nominal rate; for each version
and for each of said plurality of nominal transmitting rates, determining for each of at least some of the
discrete portions thereof the maximum of the error values for that portion; storing said maximum error values;
ascertaining the data rate permitted by the network; ascertaining the state of a receiving buffer at the
terminal; for each version, using the ascertained permitted data rate and the stored maximum error
values to estimate a maximum error value corresponding to said ascertained permitted data rate; for
each version, comparing the estimated maximum error value with the ascertained buffer state; selecting 
one of said versions for transmission, in dependence on the results of said comparisons; and 
transmitting the selected version. 

Further aspects of the invention are set out in the claims.

Some embodiments of the invention will now be described, by way of example, with reference to the
accompanying drawings.
In Figure 1, a streamer 1 contains (or has access to) a store 11 in which are stored files each being a
compressed version of a video sequence, encoded using a conventional compression algorithm such as
that defined in the ITU standards H.261 or H.263, or one of the ISO MPEG standards. More
particularly, the store 11 contains, for the same original video material, several files each encoded with
a different degree of compression. In practice all the material could if desired be stored in one single
file, but for the purposes of description they will be assumed to be separate files. Thus Figure 1 shows
three such files: V1, encoded with a high degree of compression and hence low bit-rate, representing a
low-quality recording; V2, encoded with a lesser degree of compression and hence higher bit-rate,
representing a medium-quality recording; and V3, encoded with a low degree of compression and
hence even higher bit-rate, representing a high-quality recording. Naturally one may store similar
multiple recordings of further video sequences, but this is not important to the principles of operation.
By "bit-rate" here is meant the bit-rate generated by the original encoder and consumed by the ultimate
decoder; in general this is not the same as the rate at which the streamer actually transmits, which will
be referred to as the transmitting bit-rate. It should also be noted that these files are generated at a
variable bit-rate (VBR) - that is, the number of bits generated for any particular frame of the video
depends on the picture content. Consequently, references above to low (etc.) bit-rate refer to the
average bit-rate.

The server has a transmitter 12 [...]. The transmitter is conventional, perhaps operating with a well
known protocol such as TCP/IP. A control unit 13 serves in conventional manner [...] sequence, and to
read packets of data from the store 11 for sending to the transmitter 12 as and when the
transmitter is able to receive them. Here it is assumed that the data are read out as discrete packets,
often one packet per frame of video, though the possibility of generating more than one packet for a
single frame is not excluded. (Whilst it is in principle possible for a single packet to contain data for
more than one frame, this is not usually of much interest in practice.)

Note that these packets are not necessarily related to any packet structure used on the network 2.
The terminal 3 has a receiver 31, a buffer 32, primarily for accommodating short-term fluctuations in
network delay and throughput, and a decoder 33. In principle, the terminal is conventional, though to
get full benefit from the use of the server, one might choose to use a terminal having a larger buffer
than is usual.

Some networks (including TCP/IP networks) have the characteristic that the available transmitting data
rate fluctuates according to the degree of loading on the network. The reason for providing alternate
versions V1, V2, V3 of one and the same video sequence is that one may choose a version that the
network is currently able to support. Another function of the control unit 13, therefore, is to interrogate
the transmitter 12 to ascertain the transmitting data rate that is currently available, and take a decision
as to which version to send. Here, as in many such systems, this is a dynamic process: during the
course of a transmission the available rate is continually monitored so that as conditions improve (or
deteriorate) the server may switch to a higher (or lower) quality version. Sometimes (as in TCP/IP)
the available transmitting rate is not known until after transmission has begun; one solution is always to
begin by sending the lowest-rate version and switch up if and when it becomes apparent that a higher
quality version can be accommodated.

Some systems employ additional versions of the video sequence representing transitional data which
can be transmitted between the cessation of one version and the commencement of a different one, so
as to bridge any incompatibility between the two versions. If required, this may be implemented, for
example, in the manner described in our U.S. patent 6,002,440.

In this description we will concentrate on the actual decision on if and when to switch. Conventional
systems compare the available transmitting bit-rate with the average bit-rates of the versions available
for transmission. We have recognised, however, that this is unsatisfactory for VBR systems because it
leaves open the possibility that at some time in the future the available transmitting bit-rate will be
insufficient to accommodate short-term fluctuations in instantaneous bit-rate as the latter varies with
picture content. Some theoretical discussion is in order at this point.
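By way of illustration only, the following Python sketch shows how a comparison against the average bit-rate can pass while a packet nevertheless misses its decoding deadline. The packet sizes, the 40 ms frame interval, the start-up delay and the function names are hypothetical values chosen for the example; none of them is taken from the description.

```python
def average_rate_ok(bits, frame_interval, rate):
    """Conventional test: is the channel rate at least the average bit-rate?"""
    average = sum(bits) / (len(bits) * frame_interval)
    return rate >= average


def playout_underflows(bits, frame_interval, rate, start_delay):
    """Return True if any packet finishes arriving after its decode deadline.

    Packets are sent back to back at `rate` (bits/s) starting at time 0;
    packet i must be complete by start_delay + i * frame_interval.
    """
    sent = 0.0
    for i, size in enumerate(bits):
        sent += size
        arrival = sent / rate
        deadline = start_delay + i * frame_interval
        if arrival > deadline:
            return True
    return False


# Hypothetical VBR sequence: a busy opening scene followed by a quiet one.
bits = [120_000] * 5 + [10_000] * 20   # bits per packet, one packet per frame
frame_interval = 0.04                  # seconds (25 frames per second)
rate = 900_000                         # bits per second offered by the network

print(average_rate_ok(bits, frame_interval, rate))           # average is 800 kbit/s
print(playout_underflows(bits, frame_interval, rate, 0.1))   # burst misses deadlines
```

In this example the average-rate test is satisfied, yet the opening burst cannot be delivered in time, which is the failure mode the timing analysis below is intended to detect.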

As shown in Figure 2, an encoded video sequence consists of N packets. Each packet has a header
containing a time index $t_i$ ($i = 0 \ldots N-1$) (in terms of real display time - e.g. this could be the video frame
number) and contains $b_i$ bits. This analysis assumes that packet i must be completely received before it
can be decoded (i.e. one must buffer the whole packet first).

In a simple case, each packet corresponds to one frame, and the time-stamps $t_i$ increase monotonically,
that is, $t_{i+1} > t_i$ for all i. If however (i) a frame can give rise to two or more packets (each with the
same $t_i$) then $t_{i+1} \ge t_i$. If (ii) frames can run out of capture-and-display sequence (as in MPEG) then
the $t_i$ do not increase monotonically.

These times are relative. Suppose the receiver has received packet 0 and starts decoding packet 0 at
time $W + t_0$. At "time now" of $W + t_g$ the receiver has received packet g (and possibly more packets
too) and has just started to decode packet g.

Packets g to h-1 are in the buffer. Note that (in the simple case) if h = g + 1 then the buffer contains
packet g only. At time $W + t_j$ the decoder is required to start decoding packet j. Therefore, at that time
$W + t_j$ the decoder will need to have received all packets up to and including packet j.
The time available from now up to $W + t_j$ is

$$(W + t_j) - (W + t_g) = t_j - t_g. \tag{1}$$

The data to be sent in that time are that for packets h to j, viz.

$$\sum_{i=h}^{j} b_i \tag{2}$$

which at a transmitting rate R will require a transmission duration

$$\frac{1}{R}\sum_{i=h}^{j} b_i. \tag{3}$$

This is possible only if this transmission duration is less than or equal to the time available, i.e. when
the currently available transmitting rate R satisfies the inequality

$$\frac{1}{R}\sum_{i=h}^{j} b_i \le t_j - t_g. \tag{4}$$

Note that this is the condition for satisfactory reception and decoding of frame j: satisfactory
transmission of the whole of the remaining sequence requires that this condition be satisfied for all
j = h ... N-1.
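As a minimal sketch of this test, assuming time-stamps held in milliseconds and a rate expressed in bits per millisecond, the condition of Equation (4) can be checked for every remaining packet as follows. The function name and the sample values are hypothetical.

```python
def deadlines_met(b, t, g, h, rate):
    """Check inequality (4) for every j = h ... N-1.

    b[i]  size of packet i in bits
    t[i]  time-stamp of packet i (ms)
    g     index of the packet now being decoded
    h     index of the next packet to be sent (packets g..h-1 are buffered)
    rate  transmitting rate R in bits per ms
    """
    sent_bits = 0
    for j in range(h, len(b)):
        sent_bits += b[j]                     # sum of b_i for i = h..j
        if sent_bits / rate > t[j] - t[g]:    # duration exceeds time available
            return False
    return True


# Hypothetical six-packet sequence, one packet every 40 ms.
t = [0, 40, 80, 120, 160, 200]
b = [8000, 8000, 8000, 40000, 8000, 8000]

print(deadlines_met(b, t, g=1, h=3, rate=500))   # True: packet 3 arrives exactly on time
print(deadlines_met(b, t, g=1, h=3, rate=400))   # False: packet 3 needs 100 ms, only 80 ms available
```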

For reasons that will become apparent, we rewrite Equation (4) as:

$$\frac{1}{R}\sum_{i=h}^{j} b_i - (t_j - t_{h-1}) \le t_{h-1} - t_g. \tag{5}$$

Note that

$$t_j - t_{h-1} = \sum_{i=h}^{j} (t_i - t_{i-1}) = \sum_{i=h}^{j} \Delta t_i, \quad \text{where } \Delta t_i = t_i - t_{i-1}.$$

Also, we define $\Delta\varepsilon_i = (b_i / R) - \Delta t_i$
and $T_B = t_{h-1} - t_g$; note that $T_B$ is the difference between the time-stamp of the most recently received
packet in the buffer and the time-stamp of the least recently received packet in the buffer - i.e. the one
that we have just started to decode.

Then the condition is

$$\sum_{i=h}^{j} \Delta\varepsilon_i \le T_B. \tag{6}$$

For a successful transmission up to the last packet N-1, this condition must be satisfied for any possible
j, viz.

$$\max_{j=h}^{N-1} \left\{ \sum_{i=h}^{j} \Delta\varepsilon_i \right\} \le T_B. \tag{7}$$

The left-hand side of Equation (7) represents the maximum timing error that may occur from the
transmission of packet h up to the end of the sequence, and the condition states, in effect, that this error
must not exceed the ability of the receiver buffer to accommodate it, given its current contents. For
convenience, we will label the left-hand side of Equation (7) as $T_h$ - i.e.

$$T_h = \max_{j=h}^{N-1} \left\{ \sum_{i=h}^{j} \Delta\varepsilon_i \right\}. \tag{8}$$
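A minimal Python sketch of this quantity follows, computing the maximum timing error as the largest running sum of the per-packet errors (packet transmission time less the time-stamp increment). It assumes h is at least 1, time-stamps in milliseconds and a rate in bits per millisecond; the function name and sample values are hypothetical.

```python
def max_timing_error(b, t, h, rate):
    """Maximum over j = h..N-1 of sum_{i=h..j} (b[i]/rate - (t[i] - t[i-1]))."""
    running = 0.0
    worst = float("-inf")
    for i in range(h, len(b)):
        running += b[i] / rate - (t[i] - t[i - 1])   # add the error of packet i
        worst = max(worst, running)                  # keep the largest partial sum
    return worst


# Hypothetical six-packet sequence, one packet every 40 ms.
t = [0, 40, 80, 120, 160, 200]
b = [8000, 8000, 8000, 40000, 8000, 8000]

print(max_timing_error(b, t, h=3, rate=500))   # 40.0 ms
print(max_timing_error(b, t, h=3, rate=400))   # 60.0 ms
```

With packets 1 and 2 in the buffer (g = 1, h = 3) the buffered time is 80 - 40 = 40 ms, so under these assumed figures the sequence is just deliverable at 500 bits/ms and not at 400 bits/ms.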

In practice we prefer to allow switching only at certain defined "switching points" in the sequence (and
naturally provide the transitional data mentioned earlier only for such points). In that case the test
needs to be performed only at such points. The switching decision at frame h may proceed as follows:

• interrogate the transmitter 12 to determine the available transmitting rate R;

• ascertain the current value of $T_B$: this may be calculated at the terminal and transmitted to the server, or
may be calculated at the server (see below);

• compute (for each file V1, V2, V3) $T_h$ in accordance with Equation (8) - let these be called $T_h(1)$,
$T_h(2)$, $T_h(3)$;

• determine the highest value of k for which $T_h(k) + \Delta < T_B$, where $\Delta$ is a fixed safety margin;

• select file Vk for transmission.
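The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: all versions are assumed to share the same time-stamps, the fallback to V1 when no version passes the test is an assumption not stated in the description, and the names and sample values are hypothetical.

```python
def max_timing_error(b, t, h, rate):
    """Largest running sum of (b[i]/rate - (t[i] - t[i-1])) for i = h..N-1."""
    running, worst = 0.0, float("-inf")
    for i in range(h, len(b)):
        running += b[i] / rate - (t[i] - t[i - 1])
        worst = max(worst, running)
    return worst


def select_version(versions, t, h, rate, t_b, margin=0.0):
    """Return k (1-based) of the highest version whose error plus margin is below t_b.

    versions  list of per-packet bit counts, ordered V1 (lowest quality) upwards
    t_b       buffered play-out time, in the same unit as t
    """
    chosen = 1   # assumption: fall back to the lowest-quality version
    for k, b in enumerate(versions, start=1):
        if max_timing_error(b, t, h, rate) + margin < t_b:
            chosen = k
    return chosen


# Hypothetical data: times in ms, rate in bits per ms.
t = [0, 40, 80, 120, 160, 200]
versions = [
    [4000] * 6,                                    # V1
    [8000, 8000, 8000, 40000, 8000, 8000],         # V2
    [16000, 16000, 16000, 80000, 16000, 16000],    # V3
]

print(select_version(versions, t, h=3, rate=500, t_b=60))    # 2
print(select_version(versions, t, h=3, rate=2000, t_b=60))   # 3
```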

The calculation of $T_B$ at the server will depend on the exact method of streaming that is in use.

Our preferred method is (as described in our international patent application no. PCT/GB01/05246
[Agent's Ref. A26079]) to send, initially, video at the lowest quality, so that the terminal
may immediately start decoding whilst at the same time the receiving buffer can be filling up because
data is being sent at a higher rate than it is used. In this case the server can deduce current client
session time (i.e. the time-stamp of the packet currently being decoded at the terminal) without any
feedback, and so

$T_B$ = latest sent packet time - current client session time.
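A minimal sketch of this server-side calculation is given below. It assumes that the terminal begins decoding at the instant the session starts, that play-out then advances in real time, and that network delay is negligible; these assumptions, the function name and the parameter names are illustrative only.

```python
def buffer_time_at_server(latest_sent_timestamp, session_start, now, first_timestamp=0.0):
    """Estimate the buffered play-out time (seconds) without terminal feedback.

    latest_sent_timestamp  time-stamp of the most recently sent packet
    session_start, now     server wall-clock readings in seconds
    first_timestamp        time-stamp of the first packet of the sequence
    """
    # Assumed: the terminal has been playing continuously since session_start.
    current_client_session_time = first_timestamp + (now - session_start)
    return latest_sent_timestamp - current_client_session_time


# 8.5 s into the session, the server has sent video up to time-stamp 12.0 s.
print(buffer_time_at_server(12.0, session_start=100.0, now=108.5))   # 3.5
```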

If the system is arranged such that the terminal waits until some desired state of buffer fullness is
reached before playing begins, then the situation is not quite so simple because there is an additional
delay to take into account. If this delay is fixed, it can be included in the calculation. Similarly, if the
terminal calculates when to start playing and both the algorithm used, and the parameters used by the
algorithm, are known by the server, again this can be taken into account. If however the terminal is of
unknown type, or [...] needed.

Now this procedure will work perfectly well, but does involve a considerable amount of processing
that has to be carried out during the transmission process. In a modified implementation, therefore, we
prefer to perform as much as possible of this computation in advance. In principle this involves the
calculation of $T_h(k)$ for every packet that follows a switching point, and storing this value in the packet
header. Unfortunately, this calculation (Equation (8) and the definition of $\Delta\varepsilon_i$) involves the value of R,
which is of course unknown at the time of this pre-processing. Therefore we proceed by calculating
$T_h(k)$ for a selection of possible values of R, for example (if $R_A$ is the average bit rate of the file in
question)

$R_1 = 0.5 R_A$
$R_2 = 0.7 R_A$
$R_3 = R_A$
$R_4 = 1.3 R_A$
$R_5 = 2 R_A$

So each packet h has these five precalculated values of $T_h$ stored in it. If required (for the purposes to
be discussed below) one may also store the relative time position at which the maximum in Equation
(8) occurs, that is,

$\Delta t_{\max} = t_{j_{\max}} - t_h$, where $j_{\max}$ is the value of j in Equation (8) for which $T_h$ is obtained.
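The pre-processing for one packet can be sketched as follows. The sketch assumes the five rate factors listed above, takes the average bit rate as a parameter, and measures the stored time position from the time-stamp of packet h (an assumption, since the source text is damaged at this point). Names and sample values are hypothetical.

```python
RATE_FACTORS = (0.5, 0.7, 1.0, 1.3, 2.0)   # multiples of the average bit rate


def precompute_for_packet(b, t, h, r_a):
    """Return {factor: (max_error, dt_max)} for packet h at each nominal rate.

    b[i], t[i]  packet sizes (bits) and time-stamps (ms)
    r_a         average bit rate of the file, in bits per ms
    """
    table = {}
    for factor in RATE_FACTORS:
        rate = factor * r_a
        running, worst, j_max = 0.0, float("-inf"), h
        for j in range(h, len(b)):
            running += b[j] / rate - (t[j] - t[j - 1])
            if running > worst:              # remember where the maximum occurs
                worst, j_max = running, j
        table[factor] = (worst, t[j_max] - t[h])
    return table


# Hypothetical six-packet sequence with a two-packet burst.
t = [0, 40, 80, 120, 160, 200]
b = [8000, 8000, 8000, 30000, 30000, 8000]

table = precompute_for_packet(b, t, h=3, r_a=500)
print(table[1.0])   # (40.0, 40): maximum reached one frame interval after packet 3
print(table[0.5])   # (160.0, 40)
print(table[2.0])   # (-10.0, 0)
```

In practice such a table would be computed once per switching point and stored with the packet, so that no summation is needed during transmission.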

In this case the switching decision at frame h proceeds as follows:
• interrogate the transmitter 12 to determine the available transmitting rate R;
• ascertain the current value of $T_B$, as before;
• EITHER - in the event that R corresponds to one of the rates for which $T_h$ has been precalculated - read
this value from the store (for each file V1, V2, V3);
• OR - in the event that R does not so correspond, read from the store the value of $T_h$ (and, if required,
$\Delta t_{\max}$) that correspond to the highest one ($R^-$) of the rates $R_1 \ldots R_5$ that is less than the actual value of R,
and estimate $T_h$ from it (again, for each file V1, V2, V3);
• determine the highest value of k for which $T_h(k) + \Delta < T_B$;
• select file Vk for transmission.
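The EITHER/OR look-up in the steps above can be sketched with Python's standard bisect module. The layout of the stored records, the stored values themselves and the handling of a rate below the lowest nominal rate (which the description does not address) are assumptions made for illustration.

```python
from bisect import bisect_right


def lookup(stored, rate):
    """Find the stored record for the highest nominal rate not exceeding `rate`.

    stored  list of (nominal_rate, max_error, dt_max), sorted by nominal_rate
    Returns (nominal_rate, max_error, dt_max, exact), where `exact` is True
    when `rate` equals a nominal rate, or None when `rate` lies below all
    nominal rates (a case the description does not cover).
    """
    rates = [record[0] for record in stored]
    i = bisect_right(rates, rate)        # number of nominal rates <= rate
    if i == 0:
        return None
    nominal, max_error, dt_max = stored[i - 1]
    return nominal, max_error, dt_max, nominal == rate


# Hypothetical stored values for one packet of one file (bits/ms, ms, ms).
stored = [(250, 160.0, 40), (350, 102.9, 40), (500, 40.0, 40),
          (650, 12.3, 40), (1000, 0.0, 0)]

print(lookup(stored, 500))   # (500, 40.0, 40, True): use the stored value directly
print(lookup(stored, 600))   # (500, 40.0, 40, False): an estimate is needed
print(lookup(stored, 100))   # None
```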
The estimate of $T_h$ could be performed simply by using the value $T_h^-$ associated with $R^-$; this would
work, but since it would overestimate $T_h$ it would result, at times, in a switch to a higher quality stream
being judged impossible even though it were possible. Another option would be by linear (or other)
interpolation between the values of $T_h$ stored for the two values of $R_1 \ldots R_5$ each side of the actual
value R. However, our preferred approach is to calculate an estimate according to:

$$T_h' = \left(T_h^- + \Delta t_{\max}^-\right)\frac{R^-}{R} - \Delta t_{\max}^-$$

where $R^-$ is the highest one of the rates $R_1 \ldots R_5$ that is less than the actual value of R, $T_h^-$ is the
precalculated $T_h$ for this rate, and $\Delta t_{\max}^-$ is the time from $t_h$ at which $T_h^-$ is obtained (i.e. is the accompanying
value of $\Delta t_{\max}$). In the event that this method returns a negative value, we set it to zero.

Note that this is only an estimate, as $T_h$ is a nonlinear function of rate. However with this method $T_h'$ is
always higher than the true value and automatically provides a safety margin (so that the margin $\Delta$
shown above may be omitted).
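As a hedged illustration, the function below assumes that the preferred estimate takes the form (stored maximum error + stored time position) × (nominal rate / actual rate) - stored time position, with negative results set to zero as stated above. The function and parameter names are hypothetical.

```python
def estimate_max_error(t_h_minus, dt_max_minus, r_minus, rate):
    """Rescale a stored maximum timing error from nominal rate r_minus to `rate`.

    t_h_minus     maximum timing error precalculated at r_minus
    dt_max_minus  stored time position at which that maximum occurs
    r_minus       highest nominal rate below the actual rate
    rate          actual available transmitting rate
    """
    # t_h_minus + dt_max_minus is a transmission time at r_minus; scaling it by
    # r_minus / rate converts it to a transmission time at the actual rate.
    estimate = (t_h_minus + dt_max_minus) * r_minus / rate - dt_max_minus
    return max(estimate, 0.0)   # negative estimates are set to zero


print(estimate_max_error(40.0, 80.0, 500.0, 600.0))    # 20.0
print(estimate_max_error(0.0, 40.0, 500.0, 1000.0))    # 0.0 (clamped from -20.0)
```

When the actual rate equals the nominal rate the function returns the stored value unchanged, which gives a simple consistency check on the assumed form.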

Note that these equations are valid for the situation where the encoding process generates two or more
packets (with equal $t_i$) for one frame, and for the situation encountered in MPEG with bidirectional
prediction where the frames are transmitted in the order in which they need to be decoded, rather than
in order of ascending $t_i$.
Claims



1. A method of transmitting encoded video over a network to a terminal, comprising
storing a plurality of encoded versions of the same video sequence, wherein each version comprises a
plurality of discrete portions of data and each version corresponds to a respective different degree of
compression;
ascertaining the data rate permitted by the network;
ascertaining the state of a receiving buffer at the terminal;
for each version, computing for discrete portions thereof as yet unsent the value of a timing error that
would occur were any number of portions starting with that portion to be sent at the currently
ascertained permitted rate;
for each version, determining for each of at least some of the discrete portions thereof as yet unsent the
maximum of the error values for that portion;
for each version, comparing the determined maximum error value with the ascertained buffer state;
selecting one of said versions for transmission, in dependence on the results of said comparisons; and
transmitting the selected version.






2. A method of transmitting encoded video over a network to a terminal, comprising
storing a plurality of encoded versions of the same video sequence, wherein each version comprises a
plurality of discrete portions of data and each version corresponds to a respective different degree of
compression;
for each version and for each of a plurality of nominal transmitting rates, computing for discrete
portions thereof the value of a timing error that would occur were any number of portions starting with
that portion to be sent at the respective nominal rate;
for each version and for each of said plurality of nominal transmitting rates, determining for each of at
least some of the discrete portions thereof the maximum of the error values for that portion;
storing said maximum error values;
ascertaining the data rate permitted by the network;
ascertaining the state of a receiving buffer at the terminal;
for each version, using the ascertained permitted data rate and the stored maximum error values to
estimate a maximum error value corresponding to said ascertained permitted data rate;
for each version, comparing the estimated maximum error value with the ascertained buffer state;
selecting one of said versions for transmission, in dependence on the results of said comparisons; and
transmitting the selected version.

3. A method according to claim 1 or 2 in which said maximum timing error determination is
performed only for selected ones of said portions at which a version change is to be permitted.

4. A method according to claim 1, 2 or 3 in which each computed timing error value is the
difference between (a) the time needed to transmit, at the relevant rate, the portion in question and zero
or more consecutive subsequent portions up to and including any particular portion, and (b) the
difference between the playing instant of the respective particular portion and the playing instant of the
portion preceding the portion in question.

5. A video recording stored on a carrier, comprising
a plurality of encoded versions of the same video sequence, wherein each version comprises a plurality
of discrete portions of data and each version corresponds to a respective different degree of
compression; and
for each discrete portion of each version and for each of a plurality of nominal transmitting rates, a
maximum error value for that portion, being the maximum of (a) the value of a timing error that would
occur were that portion to be sent at the respective nominal rate; and
(b) the values of a timing error that would occur were that portion and any number of
portions subsequent thereto to be sent at the respective nominal rate.